Empirical Risk Minimization for Stochastic Convex Optimization: O(1/n)- and O(1/n²)-type of Risk Bounds

Authors

  • Lijun Zhang
  • Tianbao Yang
  • Rong Jin
Abstract

Although there exist plentiful theories of empirical risk minimization (ERM) for supervised learning, current theoretical understanding of ERM for a related problem, stochastic convex optimization (SCO), is limited. In this work, we strengthen the theory of ERM for SCO by exploiting smoothness and strong convexity conditions to improve the risk bounds. First, we establish an Õ(d/n + √(F∗/n)) risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where d is the dimensionality of the problem, n is the number of samples, and F∗ is the minimal risk. Thus, when F∗ is small we obtain an Õ(d/n) risk bound, which is analogous to the Õ(1/n) optimistic rate of ERM for supervised learning. Second, if the objective function is also λ-strongly convex, we prove an Õ(d/n + κF∗/n) risk bound, where κ is the condition number, and improve it to O(1/[λn²] + κF∗/n) when n = Ω̃(κd). As a result, we obtain an O(κ/n²) risk bound under the condition that n is large and F∗ is small, which, to the best of our knowledge, is the first O(1/n²)-type risk bound for ERM. Third, we stress that the above results are established in a unified framework, which allows us to derive new risk bounds under weaker conditions, e.g., without convexity of the random function. Finally, we demonstrate that to achieve an O(1/[λn²] + κF∗/n) risk bound for supervised learning, the Ω̃(κd) requirement on n can be replaced with Ω(κ), which is dimensionality-independent.
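For quick reference, the setup and the bounds quoted above can be collected in one display. The symbols below (w for the decision variable, ξ for a random sample, ŵₙ for the empirical risk minimizer, 𝒲 for the feasible set) are shorthand introduced here and may not match the paper's own notation; the bounds themselves are transcribed from the abstract.

```latex
% Shorthand restatement of the ERM-for-SCO setup and the abstract's bounds.
% Notation (w, \xi, \widehat{w}_n, \mathcal{W}) is ours, not necessarily the paper's.
F(w) = \mathbb{E}_{\xi}\big[f(w,\xi)\big], \qquad
\widehat{w}_n \in \operatorname*{arg\,min}_{w \in \mathcal{W}} \frac{1}{n}\sum_{i=1}^{n} f(w,\xi_i), \qquad
F_* = \min_{w \in \mathcal{W}} F(w)

% Nonnegative, convex, smooth random functions; Lipschitz expected function:
F(\widehat{w}_n) - F_* \le \widetilde{O}\!\left(\frac{d}{n} + \sqrt{\frac{F_*}{n}}\right)

% Additionally \lambda-strongly convex, with condition number \kappa:
F(\widehat{w}_n) - F_* \le \widetilde{O}\!\left(\frac{d}{n} + \frac{\kappa F_*}{n}\right),
\quad \text{improved to} \quad
O\!\left(\frac{1}{\lambda n^{2}} + \frac{\kappa F_*}{n}\right) \ \text{when } n = \widetilde{\Omega}(\kappa d)
```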


Similar resources

Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds

Although there exist plentiful theories of empirical risk minimization (ERM) for supervised learning, current theoretical understanding of ERM for a related problem, stochastic convex optimization (SCO), is limited. In this work, we strengthen the theory of ERM for SCO by exploiting smoothness and strong convexity conditions to improve the risk bounds. First, we establish an Õ(d/n + √(F∗/n)) risk...

Full text

Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back

In stochastic convex optimization the goal is to minimize a convex function F(x) := E_{f∼D}[f(x)] over a convex set K ⊂ ℝ^d, where D is some unknown distribution and each f(·) in the support of D is convex over K. The optimization is commonly based on i.i.d. samples f^1, f^2, . . . , f^n from D. A standard approach to such problems is empirical risk minimization (ERM), which optimizes F_S(x) := (1/n) ∑_{i≤n} f...

Full text
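As a toy illustration of the objects F and F_S defined in the entry above, the sketch below draws i.i.d. smooth, nonnegative, convex losses (squared losses over a linear model, a choice made here purely for illustration), computes the empirical risk minimizer, and estimates its excess risk by Monte Carlo. It is a minimal sketch of the ERM setup, not an implementation of any result from these papers.

```python
import numpy as np

# Toy stochastic convex optimization instance: f(w; a, b) = (a.w - b)^2,
# which is nonnegative, convex and smooth in w. The data distribution is an
# arbitrary illustrative choice, not something taken from the papers above.
rng = np.random.default_rng(0)
d, n = 5, 1000
w_true = rng.normal(size=d)

A = rng.normal(size=(n, d))                  # i.i.d. samples a_1, ..., a_n
b = A @ w_true + 0.1 * rng.normal(size=n)    # noisy responses

# Empirical risk minimization: w_hat = argmin_w F_S(w) = (1/n) sum_i (a_i.w - b_i)^2,
# solved here in closed form by least squares.
w_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# Monte-Carlo estimate of the population risk F(w) = E[(a.w - b)^2];
# for this model w_true minimizes F, so F(w_hat) - F(w_true) estimates the excess risk.
A_test = rng.normal(size=(100_000, d))
b_test = A_test @ w_true + 0.1 * rng.normal(size=100_000)
risk = lambda w: np.mean((A_test @ w - b_test) ** 2)

print("empirical risk F_S(w_hat):", np.mean((A @ w_hat - b) ** 2))
print("estimated excess risk    :", risk(w_hat) - risk(w_true))
```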

[hal-00831977, v1] Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for...

Full text
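The entry above is about reaching an O(1/n) rate without strong convexity via stochastic approximation; the snippet below sketches one standard ingredient, constant-step-size stochastic gradient descent with iterate (Polyak-Ruppert) averaging on a streaming least-squares problem. The problem and the step size are illustrative guesses, not the paper's prescriptions.

```python
import numpy as np

# Constant-step-size SGD with Polyak-Ruppert averaging on a streaming
# least-squares problem. This is only a generic sketch of "averaged stochastic
# gradient"; the precise step-size choices and guarantees are in the paper.
rng = np.random.default_rng(1)
d, n = 5, 50_000
w_star = rng.normal(size=d)

w = np.zeros(d)        # current iterate
w_bar = np.zeros(d)    # running average of the iterates
step = 0.01            # illustrative constant step size

for t in range(1, n + 1):
    a = rng.normal(size=d)
    y = a @ w_star + 0.1 * rng.normal()
    grad = 2.0 * (a @ w - y) * a     # stochastic gradient of (a.w - y)^2
    w -= step * grad
    w_bar += (w - w_bar) / t         # online average of w_1, ..., w_t

print("last-iterate error    :", np.linalg.norm(w - w_star))
print("averaged-iterate error:", np.linalg.norm(w_bar - w_star))
```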

Bundle Methods for Machine Learning

We present a globally convergent method for regularized risk minimization problems. Our method applies to Support Vector estimation, regression, Gaussian Processes, and any other regularized risk minimization setting which leads to a convex optimization problem. SVMPerf can be shown to be a special case of our approach. In addition to the unified framework we present tight convergence bounds, w...

Full text

Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure

Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. However, in the context of empirical risk minimization, it is often helpful to augment the training set by considering random perturbations of input examples. In this case, the objective is no longer a finite sum, and the main candidate for optimization is the stochas...

Full text
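The last entry above concerns variance-reduced stochastic optimization when the objective is no longer a plain finite sum; as background, the sketch below shows the textbook SVRG update on an ordinary finite sum of squared losses. The problem, step size, and epoch length are illustrative choices, and the perturbed/infinite-data setting the paper targets is not modeled here.

```python
import numpy as np

# Textbook SVRG on a plain finite sum: min_w (1/n) sum_i (a_i.w - b_i)^2.
# The paper above extends variance reduction beyond this plain finite-sum
# setting (e.g. to randomly perturbed examples); none of that is modeled here.
rng = np.random.default_rng(2)
d, n = 5, 200
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of the i-th loss (a_i.w - b_i)^2."""
    return 2.0 * (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    """Gradient of the full empirical objective."""
    return 2.0 * A.T @ (A @ w - b) / n

w = np.zeros(d)
step, epochs, m = 0.01, 30, 2 * n    # illustrative hyperparameters
for _ in range(epochs):
    w_ref = w.copy()
    g_ref = full_grad(w_ref)         # full gradient at the reference point
    for _ in range(m):
        i = rng.integers(n)
        # variance-reduced stochastic gradient
        g = grad_i(w, i) - grad_i(w_ref, i) + g_ref
        w -= step * g

print("final objective:", np.mean((A @ w - b) ** 2))
```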



Journal:

Volume   Issue

Pages  -

Publication date: 2017